Computing Graph Descriptors on Edge Streams

Abstract

Feature extraction is an essential task in graph analytics. These feature vectors, called graph descriptors, are used in downstream vector-space-based graph analysis models. This idea has proved fruitful in the past, with spectral-based graph descriptors providing state-of-the-art classification accuracy. However, known algorithms to compute meaningful descriptors do not scale to large graphs since: (1) they require storing the entire graph in memory, and (2) the end-user has no control over the algorithm's runtime. In this article, we present streaming algorithms to approximately compute three different graph descriptors capturing the structure of graphs. Operating on edge streams allows us to avoid storing the entire graph in memory, and controlling the sample size enables us to keep the runtime of our algorithms within desired bounds. We demonstrate the efficacy of the proposed descriptors by analyzing the approximation error. Our algorithms are scalable to graphs with millions of edges, computing descriptors within minutes. Moreover, these descriptors yield predictive accuracy comparable to state-of-the-art methods but can be computed using only 25% as much memory.
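To make the idea of a bounded edge sample concrete, the following is a minimal sketch of reservoir sampling over an edge stream. It is an illustration under stated assumptions, not the authors' algorithm: the abstract does not name the sampling scheme, and the function name `reservoir_sample_edges` and the parameter `budget` are hypothetical. The sketch only shows how a fixed sample size caps memory regardless of stream length; any descriptor would then be estimated from the retained edges.

```python
import random
from typing import Iterable, List, Optional, Tuple

Edge = Tuple[int, int]


def reservoir_sample_edges(
    stream: Iterable[Edge], budget: int, seed: Optional[int] = None
) -> List[Edge]:
    """Keep a uniform random sample of at most `budget` edges from a stream.

    Hypothetical helper for illustration: classic reservoir sampling
    (Algorithm R). Memory use is O(budget), independent of stream length.
    """
    if budget <= 0:
        raise ValueError("budget must be positive")
    rng = random.Random(seed)
    reservoir: List[Edge] = []
    for t, edge in enumerate(stream, start=1):
        if len(reservoir) < budget:
            # The first `budget` edges fill the reservoir directly.
            reservoir.append(edge)
        else:
            # The t-th edge replaces a random slot with probability budget / t.
            j = rng.randrange(t)
            if j < budget:
                reservoir[j] = edge
    return reservoir


if __name__ == "__main__":
    # A path graph on 1001 vertices, consumed as a one-pass edge stream.
    edges = ((i, i + 1) for i in range(1000))
    sample = reservoir_sample_edges(edges, budget=50, seed=0)
    print(len(sample))
```

Because the reservoir never grows beyond `budget` edges, the cost of any per-sample computation is governed by the chosen sample size rather than by the size of the full graph.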

Related articles

On Clustering Graph Streams

In this paper, we will examine the problem of clustering massive graph streams. Graph clustering poses significant challenges because of the complex structures which may be present in the underlying data. The massive size of the underlying graph makes explicit structural enumeration very difficult. Consequently, most techniques for clustering multi-dimensional data are difficult to generalize t...

On Summarizing Graph Streams

Graph streams, that is, graphs whose edges are updated sequentially in the form of a stream, have wide applications in areas such as cyber security, social networks, and transportation networks. This paper studies the problem of summarizing graph streams. Specifically, given a graph stream G, directed or undirected, the objective is to summarize G as SG with much smaller (sublinear) space, linear...

Graph Mining on Streams

Multi-Pass Models: It is common in graph mining to consider algorithms that may take more than one pass over the stream. There has also been work in the W-Stream model in which the algorithm is allowed to write to the stream during each pass [9]. These annotations can then be utilized by the algorithm during successive passes and it can be shown that this gives sufficient power to the model for...

Computing on data streams

In this paper we study the space requirement of algorithms that make only one (or a small number of) pass(es) over the input data. We study such algorithms under a model of data streams that we introduce here. We give a number of upper and lower bounds for problems stemming from query processing, invoking in the process tools from the area of communication complexity.

The edge tenacity of a split graph

The edge tenacity Te(G) of a graph G is defined as: Te(G) = min { [|X| + τ(G - X)] / [ω(G - X) - 1] | X ⊆ E(G) and ω(G - X) > 1 }, where the minimum is taken over every edge-cutset X that separates G into ω(G - X) components, and τ(G - X) denotes the order of a largest component of G - X. The objective of this paper is to determine this quantity for split graphs. Let G = (Z; I; E) be a noncomplete connected split...

Journal

Journal title: ACM Transactions on Knowledge Discovery From Data

Year: 2023

ISSN: 1556-472X, 1556-4681

DOI: https://doi.org/10.1145/3591468